Skip to content

Add support for table valued functions for SQL Server #1839

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

aharpervc
Copy link
Contributor

@aharpervc aharpervc commented May 6, 2025

This PR adds support for table valued functions for SQL Server, both inline & multi statement functions. For reference, that's the B & C documentation here: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-ver16#b-create-an-inline-table-valued-function

Inline TVF's are defined with AS RETURN, so we have a new CreateFunctionBody::AsReturn variant accordingly. Functions using "AS RETURN" don't have BEGIN/END, so that part of the parsing logic is now conditional. Additionally, the data type parser now supports "RETURNS TABLE" without a table definition.

Multi statement TVF's use named table expressions, so a new NamedTable data type variant was added. I didn't see a great way to integrate this into the existing data type parser (especially without rewinding), so creating this data type happens inside the parse create function logic first by parsing the identifier, then parsing the table definition, then using those elements to produce a NamedTable.

I also added a new test example for each of these scenarios.

@aharpervc aharpervc marked this pull request as ready for review May 6, 2025 19:24
@aharpervc aharpervc changed the title Add support for table valued functions for SQL Serve Add support for table valued functions for SQL Server May 6, 2025
@aharpervc aharpervc force-pushed the mssql-create-tvf branch 3 times, most recently from 3686b1b to db1c9b2 Compare May 6, 2025 23:37
CREATE FUNCTION some_inline_tvf(@foo INT, @bar VARCHAR(256)) \
RETURNS TABLE \
AS \
RETURN (SELECT 1 AS col_1)\
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parentheses are optional for inline tvf return queries, although I think the subquery expr expects/requires them currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added support for that syntax & added a new test case example

Copy link
Contributor Author

@aharpervc aharpervc May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNION is also supported in this syntax but not in this current approach due to using parse_select, which is restricted. I'm comfortable leaving that for later

Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

took a quick look and left a couple comments, @aharpervc could you rebase on main now that the other PR has landed, in order to remove the extra diff?

Comment on lines 5235 to 5237
if self.peek_keyword(Keyword::AS) {
self.expect_keyword_is(Keyword::AS)?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.peek_keyword(Keyword::AS) {
self.expect_keyword_is(Keyword::AS)?;
}
self.parse_keyword(Keyword::AS);

this looks like it could be simplified as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

Comment on lines 728 to 731
if fields.is_empty() {
return write!(f, "TABLE");
}
write!(f, "TABLE({})", display_comma_separated(fields))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of using the is_empty we can make the fields value an Option and skip the parenthesis only if fields is None. Otherwise we could run into issues if empty fields is allowed in the other variant of this datatype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

Comment on lines +5229 to +5243
let return_type = if return_table.is_some() {
return_table
} else {
Some(self.parse_data_type()?)
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be included in the maybe_parse() logic above? such that the logic returns both the return table and the datatype. based on this condition, they seem to go hand in hand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't see how that would be done. The use of maybe_parse is to get the table name & other parts, but if we don't have that then we should rewind and just parse a typical data type. Do you mean like having another maybe_parse inside the first maybe_parse?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, in this case its one or the other, so should be fine to leave as is.

@aharpervc aharpervc force-pushed the mssql-create-tvf branch from 1b92ede to 999964c Compare May 9, 2025 15:10
@aharpervc
Copy link
Contributor Author

took a quick look and left a couple comments, @aharpervc could you rebase on main now that the other PR has landed, in order to remove the extra diff?

Done 👍

@aharpervc aharpervc requested a review from iffyio May 9, 2025 15:12
Comment on lines 54 to 59
NamedTable(
/// Table name.
ObjectName,
/// Table columns.
Vec<ColumnDef>,
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NamedTable(
/// Table name.
ObjectName,
/// Table columns.
Vec<ColumnDef>,
),
NamedTable {
/// Table name.
table: ObjectName,
/// Table columns.
columns: Vec<ColumnDef>,
},

we can use an anonymous struct in this manner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes we can. Not sure why I did it that way. Done 👍

Table(Vec<ColumnDef>),
/// [MsSQL]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql?view=sql-server-ver16#c-create-a-multi-statement-table-valued-function
Table(Option<Vec<ColumnDef>>),
/// Table type with a name, e.g. CREATE FUNCTION RETURNS @result TABLE(...).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a link to the docs that support NamedTable variant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

/// RETURNS TABLE
/// AS RETURN SELECT a + b AS sum;
/// ```
AsReturnSelect(Select),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can also include a reference doc link here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

}));
Ok(DataType::NamedTable(
ObjectName(vec![ObjectNamePart::Identifier(return_table_name)]),
table_column_defs.clone().unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we return an error instead of the unwrap here?

self.expect_keyword_is(Keyword::AS)?;
let return_table = self.maybe_parse(|p| {
let return_table_name = p.parse_identifier()?;
let table_column_defs = if p.peek_keyword(Keyword::TABLE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can replace the if/else statement here with self.expect_keyword(Keyword::TABLE)

let statements = self.parse_statement_list(&[Keyword::END])?;
let end_token = self.expect_keyword(Keyword::END)?;
if table_column_defs.is_none()
|| table_column_defs.clone().is_some_and(|tcd| tcd.is_empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm is the clone() here necessary?

let return_table_name = p.parse_identifier()?;
let table_column_defs = if p.peek_keyword(Keyword::TABLE) {
match p.parse_data_type()? {
DataType::Table(t) => t,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think already here we can check that the returned data type is none empty, that would avoid the option invalid case propagating further below (where we have the is_none and is_some checks). e.g.

DataType::Table(Some(t)) if !t.is_empty() => t    

}));
Ok(DataType::NamedTable(
ObjectName(vec![ObjectNamePart::Identifier(return_table_name)]),
table_column_defs.clone().unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, is the clone necessary?

Comment on lines +5229 to +5243
let return_type = if return_table.is_some() {
return_table
} else {
Some(self.parse_data_type()?)
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed, in this case its one or the other, so should be fine to leave as is.

Comment on lines +5260 to +5266
if !matches!(expr, Expr::Subquery(_)) {
parser_err!(
"Expected a subquery after RETURN",
self.peek_token().span.start
)?
}
Some(CreateFunctionBody::AsReturnSubquery(expr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !matches!(expr, Expr::Subquery(_)) {
parser_err!(
"Expected a subquery after RETURN",
self.peek_token().span.start
)?
}
Some(CreateFunctionBody::AsReturnSubquery(expr))
Some(CreateFunctionBody::AsReturnExpr(expr))

thinking since a subquery is already an expr, we can be more permissive here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a "could" vs "should" situation. The question here is what is the parser expecting? I don't think it makes sense to parse arbitrary expr's if they wouldn't otherwise be allowed by any sql engine... that seems better as a parse failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ideally we would parse statements accordingy to spec but the parser is more permissive in some cases, especially in order to minimize complexity in the parser. In this case we would avoid introducing an extra AsReturnSubquery enum variant, as well as the matches!() check that does the validation after the fact of parsing an expression which isn't ideal. So I think we'd be better off leaving the validation to the downstream crate in this scenario

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes I suppose that makes sense. It seems odd for the parser to produce output if it's known to be invalid, but I suppose you're saying, the validity question is in the semantics rather than the syntax.

Loosening the implementation to any Expr would probably mean that select .. union .. select syntax would begin parsing, which addresses my concern above.

I'll investigate...

Comment on lines +9836 to +9837
if self.peek_token() != Token::LParen {
Ok(DataType::Table(None))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a comment here explaining what the LParen check is for?

Comment on lines +5260 to +5266
if !matches!(expr, Expr::Subquery(_)) {
parser_err!(
"Expected a subquery after RETURN",
self.peek_token().span.start
)?
}
Some(CreateFunctionBody::AsReturnSubquery(expr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ideally we would parse statements accordingy to spec but the parser is more permissive in some cases, especially in order to minimize complexity in the parser. In this case we would avoid introducing an extra AsReturnSubquery enum variant, as well as the matches!() check that does the validation after the fact of parsing an expression which isn't ideal. So I think we'd be better off leaving the validation to the downstream crate in this scenario

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants